Genotype Error Detection Using Hidden Markov Models of Haplotype Diversity

Authors

  • Justin Kennedy
  • Ion I. Mandoiu
  • Bogdan Pasaniuc
Abstract

The presence of genotyping errors can invalidate statistical tests for linkage and disease association, particularly for methods based on haplotype analysis. Becker et al. recently proposed a simple likelihood ratio approach for detecting errors in trio genotype data. Under this approach, a SNP genotype is flagged as a potential error if deleting it increases the likelihood of the original trio genotype data by a multiplicative factor exceeding a user-selected threshold. In this article we give improved error detection methods that combine the likelihood ratio test with likelihood functions that can be computed efficiently from a Hidden Markov Model of haplotype diversity in the population under study. Experimental results on both simulated and real datasets show that the proposed methods have highly scalable running time and achieve significantly improved detection accuracy compared to previous methods.
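The flagging rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single unrelated individual rather than a trio, and it replaces the HMM-based likelihood with a small, hypothetical haplotype frequency table (`HAP_FREQ`). The function names `genotype_likelihood` and `flag_errors`, the genotype encoding, and the threshold value are likewise assumptions made for illustration.

```python
from itertools import product

# Hypothetical haplotype frequencies over 3 SNPs (alleles coded 0/1).
# This table is a toy stand-in for the HMM-based likelihood in the abstract.
HAP_FREQ = {
    (0, 0, 0): 0.55,
    (1, 1, 1): 0.35,
    (0, 0, 1): 0.05,
    (1, 1, 0): 0.05,
}


def genotype_likelihood(genotype, hap_freq=HAP_FREQ):
    """Probability of a multilocus genotype under random mating.

    genotype[i] is the count of allele 1 at SNP i (0, 1 or 2),
    or None when the SNP genotype is missing (deleted).
    """
    total = 0.0
    # Sum over all ordered haplotype pairs compatible with the genotype.
    for (h1, p1), (h2, p2) in product(hap_freq.items(), repeat=2):
        compatible = all(
            g is None or a + b == g
            for g, a, b in zip(genotype, h1, h2)
        )
        if compatible:
            total += p1 * p2
    return total


def flag_errors(genotype, threshold, likelihood=genotype_likelihood):
    """Return indices of SNP genotypes whose deletion raises the
    likelihood by a factor greater than `threshold`."""
    base = likelihood(genotype)
    flagged = []
    for i, g in enumerate(genotype):
        if g is None:
            continue  # nothing to test at an already-missing SNP
        deleted = list(genotype)
        deleted[i] = None
        l_del = likelihood(deleted)
        if base == 0.0:
            # Original data impossible under the model: flag only if
            # deleting this SNP makes the data possible.
            ratio = float("inf") if l_del > 0.0 else 1.0
        else:
            ratio = l_del / base
        if ratio > threshold:
            flagged.append(i)
    return flagged
```

With these assumed frequencies, `flag_errors((0, 0, 1), threshold=5.0)` flags only the third SNP (index 2), because deleting it raises the likelihood from 0.055 to 0.36, while `flag_errors((1, 1, 1), threshold=5.0)` flags nothing. Passing the likelihood as a parameter keeps the flagging rule separate from the model, so an HMM-based likelihood could be substituted without changing `flag_errors`.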

Related articles

Haplotype Inference Using a Hidden Markov Model with Efficient Markov Chain Sampling

Knowledge of haplotypes is useful for understanding block structures of the genome and finding genes associated with disease. Direct measurement of haplotypes in the absence of family data is presently impractical. Hence several methods have been developed previously for reconstructing haplotypes from population data. In this thesis, a new population-based method is developed using a Hidden Mar...

Genotype calling from next-generation sequencing data using haplotype information of reads

MOTIVATION Low coverage sequencing provides an economic strategy for whole genome sequencing. When sequencing a set of individuals, genotype calling can be challenging due to low sequencing coverage. Linkage disequilibrium (LD) based refinement of genotyping calling is essential to improve the accuracy. Current LD-based methods use read counts or genotype likelihoods at individual potential pol...

Introducing Busy Customer Portfolio Using Hidden Markov Model

Despite the effective role of Markov models in customer relationship management (CRM), there is no comprehensive literature review covering all the related literature. In this paper the focus is on academic databases to find all the articles that were published in 2011 and earlier. One hundred articles were identified and reviewed to find direct relevance for applying Markov models...

Intrusion Detection Using Evolutionary Hidden Markov Model

Intrusion detection systems are responsible for diagnosing and detecting any unauthorized use of the system, exploitation or destruction, and are able to prevent cyber-attacks using network package analysis. One of the major challenges in the use of these tools is the lack of educational patterns of attacks on the part of the engine analysis; engine failure that caused the complete training, ...

Joint haplotype phasing and genotype calling of multiple individuals using haplotype informative reads

MOTIVATION Hidden Markov model, based on Li and Stephens model that takes into account chromosome sharing of multiple individuals, results in mainstream haplotype phasing algorithms for genotyping arrays and next-generation sequencing (NGS) data. However, existing methods based on this model assume that the allele count data are independently observed at individual sites and do not consider hap...

Journal:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 15, Issue 9

Publication date: 2007